DocSCAN: Unsupervised Text Classification via Learning from Neighbors
We introduce DocSCAN, a completely unsupervised text classification approach
using Semantic Clustering by Adopting Nearest-Neighbors (SCAN). For each
document, we obtain semantically informative vectors from a large pre-trained
language model. Similar documents have proximate vectors, so neighbors in the
representation space tend to share topic labels. Our learnable clustering
approach uses pairs of neighboring datapoints as a weak learning signal. The
proposed approach learns to assign classes to the whole dataset without
provided ground-truth labels. On five topic classification benchmarks, we
improve on various unsupervised baselines by a large margin. In datasets with
relatively few and balanced outcome classes, DocSCAN approaches the performance
of supervised classification. The method fails for other types of
classification, such as sentiment analysis, pointing to important conceptual
and practical differences between classifying images and texts.

Comment: in Proceedings of the 18th Conference on Natural Language Processing (KONVENS 2022), Potsdam, Germany.
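The recipe in this abstract (embed documents, mine nearest neighbors, train a clustering head on neighbor pairs) is concrete enough to sketch. Below is a minimal illustration, assuming a sentence-transformers encoder and a simplified SCAN-style consistency-plus-entropy objective; the encoder choice, neighbor count, and loss weight are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn.functional as F
from sklearn.neighbors import NearestNeighbors
from sentence_transformers import SentenceTransformer

texts = [  # toy corpus; in practice this is the full unlabeled dataset
    "The senate passed the budget bill.",
    "Lawmakers debated the new tax legislation.",
    "Congress voted on the spending package.",
    "The striker scored twice in the cup final.",
    "The goalkeeper saved a late penalty kick.",
    "The home team won the championship match.",
]

# Step 1: semantically informative vectors from a pre-trained model
# (the specific encoder here is an assumption, not the paper's setup).
emb = SentenceTransformer("all-MiniLM-L6-v2").encode(texts)
X = torch.tensor(emb)

# Step 2: mine each document's nearest neighbors in representation space.
k = 2
_, idx = NearestNeighbors(n_neighbors=k + 1).fit(emb).kneighbors(emb)
neighbors = torch.tensor(idx[:, 1:])  # drop the self-match in column 0

# Step 3: train a linear clustering head with a SCAN-style objective:
# neighboring documents should get similar class distributions, while an
# entropy term keeps assignments from collapsing into a single cluster.
n_classes, lam = 2, 2.0
head = torch.nn.Linear(X.shape[1], n_classes)
opt = torch.optim.Adam(head.parameters(), lr=1e-3)
for step in range(200):
    j = neighbors[torch.arange(len(texts)), torch.randint(0, k, (len(texts),))]
    p = F.softmax(head(X), dim=1)      # anchor distributions
    q = F.softmax(head(X[j]), dim=1)   # sampled-neighbor distributions
    consistency = -torch.log((p * q).sum(dim=1) + 1e-8).mean()
    mean_p = p.mean(dim=0)
    loss = consistency + lam * (mean_p * torch.log(mean_p + 1e-8)).sum()
    opt.zero_grad(); loss.backward(); opt.step()

print(head(X).argmax(dim=1))  # unsupervised class assignments
```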
The Law and NLP: Bridging Disciplinary Disconnects
Legal practice is intrinsically rooted in the fabric of language, yet legal
practitioners and scholars have been slow to adopt tools from natural language
processing (NLP). At the same time, the legal system is experiencing an access
to justice crisis, which could be partially alleviated with NLP. In this
position paper, we argue that the slow uptake of NLP in legal practice is
exacerbated by a disconnect between the needs of the legal community and the
focus of NLP researchers. In a review of recent trends in the legal NLP
literature, we find limited overlap between the legal NLP community and legal
academia. Our interpretation is that some of the most popular legal NLP tasks
fail to address the needs of legal practitioners. We discuss examples of legal
NLP tasks that promise to bridge disciplinary disconnects and highlight
interesting areas for legal NLP research that remain underexplored.
Enhancing Public Understanding of Court Opinions with Automated Summarizers
Written judicial opinions are an important tool for building public trust in
court decisions, yet they can be difficult for non-experts to understand. We
present a pipeline for using an AI assistant to generate simplified summaries
of judicial opinions. These are more accessible to the public and more easily
understood by non-experts. We show in a survey experiment that the simplified
summaries help respondents understand the key features of a ruling. We discuss
how to integrate legal domain knowledge into studies using large language
models. Our results suggest a role both for AI assistants to inform the public,
and for lawyers to guide the process of generating accessible summaries.
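The abstract describes the pipeline only at a high level. As one hedged illustration of the summarization step, the sketch below assumes an OpenAI-style chat API; the model name, prompt wording, and instructions are placeholders rather than the authors' actual configuration.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SYSTEM_PROMPT = (
    "You are a legal communicator. Summarize the following judicial opinion "
    "for a general audience in plain language. State the parties, the "
    "question before the court, the holding, and the key reasoning. Do not "
    "add information that is not in the opinion."
)

def simplify_opinion(opinion_text: str) -> str:
    """Return a simplified, non-expert-friendly summary of one opinion."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model choice
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": opinion_text},
        ],
        temperature=0.2,  # keep summaries conservative and stable
    )
    return response.choices[0].message.content
```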
Revisiting Automated Topic Model Evaluation with Large Language Models
Topic models are used to make sense of large text collections. However,
automatically evaluating topic model output and determining the optimal number
of topics both have been longstanding challenges, with no effective automated
solutions to date. This paper proposes using large language models to evaluate
such output. We find that large language models appropriately assess the
resulting topics, correlating more strongly with human judgments than existing
automated metrics. We then investigate whether we can use large language models
to automatically determine the optimal number of topics. We automatically
assign labels to documents and find that choosing the configuration with the
purest labels returns reasonable values for the optimal number of topics.
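The selection rule in the last sentence can be made concrete. The sketch below, on toy data, shows one plausible reading of it: for each candidate number of topics, measure how pure the LLM-assigned document labels are within each topic, and keep the configuration that maximizes purity (the paper's exact scoring may differ).

```python
from collections import Counter

def purity(topic_assignments, llm_labels):
    """Fraction of documents whose LLM label matches the majority
    label of the topic they were assigned to (higher = purer)."""
    by_topic = {}
    for topic, label in zip(topic_assignments, llm_labels):
        by_topic.setdefault(topic, []).append(label)
    matched = sum(Counter(labels).most_common(1)[0][1]
                  for labels in by_topic.values())
    return matched / len(llm_labels)

# Toy runs: candidate topic count -> (topic id per doc, LLM label per doc).
candidate_runs = {
    2: ([0, 0, 1, 1, 0, 1], ["econ", "econ", "health", "law", "econ", "law"]),
    3: ([0, 0, 1, 1, 2, 2], ["econ", "econ", "health", "health", "law", "law"]),
}
scores = {k: purity(a, l) for k, (a, l) in candidate_runs.items()}
best_k = max(scores, key=scores.get)  # -> 3, the purest configuration
print(scores, best_k)
```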
Paradigm Shift in Sustainability Disclosure Analysis: Empowering Stakeholders with CHATREPORT, a Language Model-Based Tool
This paper introduces a novel approach to enhance Large Language Models
(LLMs) with expert knowledge to automate the analysis of corporate
sustainability reports by benchmarking them against the Task Force on
Climate-related Financial Disclosures (TCFD) recommendations. Corporate
sustainability reports are crucial in assessing organizations' environmental
and social risks and impacts. However, the vast amount of information in these
reports often makes human analysis too costly. As a result, only a few
entities worldwide have the resources to analyze these reports, which could
lead to a lack of transparency. While AI-powered tools can automatically
analyze the data, they are prone to inaccuracies as they lack domain-specific
expertise. We
christen our tool CHATREPORT, and apply it in a first use case to assess
corporate climate risk disclosures following the TCFD recommendations.
CHATREPORT results from collaborating with experts in climate science, finance,
economic policy, and computer science, demonstrating how domain experts can be
involved in developing AI tools. We make our prompt templates, generated data,
and scores available to the public to encourage transparency.

Comment: This is a working paper.
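The paper's released prompt templates define the actual tool; purely as a hedged sketch of the general pattern (ground an LLM in retrieved report excerpts, then ask a TCFD-aligned question), one disclosure check might look like the following. The question text, prompt wording, and model name are illustrative assumptions.

```python
from openai import OpenAI

client = OpenAI()

# One illustrative TCFD-style item; CHATREPORT's published templates
# cover the full set of recommendations.
TCFD_QUESTION = (
    "Does the company describe the board's oversight of climate-related "
    "risks and opportunities?"
)

PROMPT = """You are an analyst reviewing a corporate sustainability report
against the TCFD recommendations. Using ONLY the excerpts below, answer the
question, quote the excerpt you rely on, and answer 'not disclosed' if the
report does not address it.

Excerpts:
{excerpts}

Question: {question}
"""

def assess_disclosure(excerpts: list[str], question: str = TCFD_QUESTION) -> str:
    """Judge one disclosure item, grounded in retrieved report excerpts."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; the tool's actual backbone may differ
        messages=[{
            "role": "user",
            "content": PROMPT.format(excerpts="\n---\n".join(excerpts),
                                     question=question),
        }],
        temperature=0.0,
    )
    return response.choices[0].message.content
```

Grounding the answer in retrieved excerpts, rather than the raw report, is what limits the inaccuracies the abstract attributes to generic AI tools.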
Evidence Selection as a Token-Level Prediction Task
In Automated Claim Verification, we retrieve evidence from a knowledge base to determine the veracity of a claim. Intuitively, the retrieval of the correct evidence plays a crucial role in this process. Often, evidence selection is tackled as a pairwise sentence classification task, i.e., we train a model to predict for each sentence individually whether it is evidence for a claim. In this work, we fine-tune document-level transformers to extract all evidence from a Wikipedia document at once. We show that this approach performs better than a comparable model classifying sentences individually on all relevant evidence selection metrics in FEVER. Our complete pipeline building on this evidence selection procedure produces a new state-of-the-art result on FEVER, a popular claim verification benchmark.
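To make the contrast with pairwise sentence classification concrete, here is a hedged sketch of the token-level formulation: encode the claim and the whole document together, tag every token as evidence or not, and read sentence-level selections off the token tags. The checkpoint is a generic Longformer whose classification head is untrained here; in practice it would be fine-tuned on FEVER's sentence annotations projected down to tokens, and the sentence alignment below is approximate.

```python
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

MODEL = "allenai/longformer-base-4096"  # stand-in document-level transformer
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForTokenClassification.from_pretrained(MODEL, num_labels=2)
model.eval()  # labels: 0 = not evidence, 1 = evidence

def select_evidence(claim: str, sentences: list[str]) -> list[str]:
    """Score all sentences of a document in one forward pass; keep those
    whose tokens are predominantly tagged as evidence."""
    enc = tokenizer(claim, " ".join(sentences),
                    return_tensors="pt", truncation=True)
    with torch.no_grad():
        token_preds = model(**enc).logits.argmax(dim=-1)[0]
    # Document tokens start after "<s> claim </s></s>" in RoBERTa-style
    # pair encoding, hence the +1 on top of the claim's own length.
    offset = len(tokenizer(claim)["input_ids"]) + 1
    selected = []
    for sent in sentences:
        n = len(tokenizer(sent, add_special_tokens=False)["input_ids"])
        window = token_preds[offset:offset + n]
        if len(window) and window.float().mean() > 0.5:
            selected.append(sent)
        offset += n
    return selected
```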
DocSCAN: Unsupervised Text Classification via Learning from Neighbors
We introduce DocSCAN, a completely unsupervised text classification approach built on the "Semantic Clustering by Adopting Nearest Neighbors" algorithm. For each document, we obtain semantically informative vectors from a large pre-trained language model. We find that similar documents have proximate vectors, so neighbors in the representation space tend to share topic labels. Our learnable clustering approach then uses pairs of neighboring datapoints as a weak learning signal to automatically learn topic assignments. On three different text classification benchmarks, we improve on various unsupervised baselines by a large margin.
Political Metaphors in U.S. Governor Speeches
How do politicians use metaphors in their speeches? To provide evidence on this question, we apply a deep-learning-based metaphor detection model to a historical corpus of annual State of the State speeches given by U.S. governors, ranging from 1995 to 2022. Across 9 socio-economic topics, we present the following descriptive findings. First, metaphors are most commonly used on fiscal and economic issues. Second, Democratic governors employ more metaphors on environmental issues relative to Republican governors, who in turn express more metaphors on moral values. Third, we confirm that the language used to express political metaphors is emotionally charged, with a degree of heterogeneity. Our emotion scores increase the most in the presence of a metaphor on subjects related to the economy, fiscal issues, and moral values.
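The descriptive findings rest on a simple aggregation: per-sentence metaphor flags from the detection model, crossed with topic, party, and an emotion score. A minimal sketch of that analysis on toy data (the real corpus and model outputs are not reproduced here):

```python
import pandas as pd

# Toy rows standing in for the speech corpus: one row per sentence, with a
# metaphor flag from the detection model and a lexicon-based emotion score.
rows = pd.DataFrame({
    "party":    ["D", "D", "D", "R", "R", "R", "D", "R"],
    "topic":    ["environment", "environment", "fiscal", "fiscal",
                 "moral", "moral", "fiscal", "fiscal"],
    "metaphor": [1, 0, 1, 1, 1, 0, 0, 0],
    "emotion":  [0.7, 0.3, 0.9, 0.8, 0.9, 0.4, 0.2, 0.3],
})

# Metaphor rate by party and topic (findings one and two compare these).
print(rows.groupby(["party", "topic"])["metaphor"].mean())

# Emotion gap between metaphorical and literal sentences per topic
# (finding three: the gap is largest for economic, fiscal, and moral topics).
by_cell = rows.groupby(["topic", "metaphor"])["emotion"].mean().unstack("metaphor")
print(by_cell[1] - by_cell[0])
```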
The Choice of Knowledge Base in Automated Claim Checking
Automated claim checking is the task of determining the veracity of a claim given evidence found in a knowledge base of trustworthy facts. While previous work has taken the knowledge base as given and optimized the claim-checking pipeline, we take the opposite approach - taking the pipeline as given, we explore the choice of knowledge base. Our first insight is that a claim-checking pipeline can be transferred to a new domain of claims with access to a knowledge base from the new domain. Second, we do not find a "universally best" knowledge base - higher domain overlap of a task dataset and a knowledge base tends to produce better label accuracy. Third, combining multiple knowledge bases does not tend to improve performance beyond using the closest-domain knowledge base. Finally, we show that the claim-checking pipeline's confidence score for selecting evidence can be used to assess whether a knowledge base will perform well for a new set of claims, even in the absence of ground-truth labels.
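The final point suggests a label-free model-selection loop: run the pipeline's evidence-selection step against each candidate knowledge base and compare its confidence scores. A hedged sketch, where `pipeline.select_evidence` is a hypothetical interface returning the top evidence and its confidence (the paper's pipeline internals are not reproduced here):

```python
import numpy as np

def rank_knowledge_bases(claims, knowledge_bases, pipeline):
    """Rank candidate knowledge bases for a new claim set, without
    ground-truth labels, by mean evidence-selection confidence."""
    scores = {}
    for name, kb in knowledge_bases.items():
        confidences = [pipeline.select_evidence(claim, kb)[1]
                       for claim in claims]
        scores[name] = float(np.mean(confidences))
    # Higher mean confidence is taken as a proxy for closer domain
    # overlap between the claims and the knowledge base.
    return sorted(scores.items(), key=lambda item: -item[1])
```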